- This is how you can recover and display the full text of a subcorpus:
corpus("GERMAPARL") %>% # take the GERMAPARL corpus
  subset(date == "2009-11-10") %>% # create a subcorpus for a single date
  subset(speaker == "Merkel") %>% # keep only what Merkel said on that date
  html(height = "250px") %>% # turn it into HTML
  highlight(list(yellow = c("Bundestag", "Regierung"))) # and highlight words of interest
- Inspecting the full text can be extremely useful when evaluating topic models. This is how you would highlight the most likely terms of a topic model using polmineR:
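# derive a highlight list from the topic model (BE_lda and ek must already exist)
h <- get_highlight_list(BE_lda, partition_obj = ek, no_token = 150)
# keep at most the first eight terms per list element; unlike x[1:8],
# head() does not pad shorter vectors with NA
h <- lapply(h, head, 8)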
corpus("BE") %>%
  subset(date == "2005-04-28") %>%
  subset(grepl("Körting", speaker)) %>%
  as.speeches(s_attribute_name = "speaker", verbose = FALSE) %>%
  .[[4]] %>% # pick the fourth speech; indexing must be its own pipe step
  html(height = "350px") %>%
  highlight(highlight = h)
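As the first example shows with `list(yellow = c("Bundestag", "Regierung"))`, `highlight()` accepts a named list whose names are colours and whose elements are character vectors of terms. The following is a minimal sketch, not polmineR's own method, of building such a list by hand in base R. The terms, the colours, and the names `top_terms` and `h_manual` are made up for illustration and do not come from an actual topic model.

```r
# Hypothetical top terms for two topics (illustrative values only,
# not the output of a real topic model)
top_terms <- list(
  topic_1 = c("Polizei", "Sicherheit"),
  topic_2 = c("Schule", "Lehrer")
)

# One colour per topic; the colour names become the names of the list
colours <- c("yellow", "lightgreen")
h_manual <- setNames(top_terms, colours[seq_along(top_terms)])

# Inspect the structure: a list with elements "yellow" and "lightgreen"
str(h_manual)
```

Assuming the same calling convention as above, a list like `h_manual` could then be passed on as `highlight(highlight = h_manual)`.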